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Paenibacillus 5p.Y412MC10 was one of a number of organisms isolated from Obsidian Hot 
Spring, Yellowstone National Park, Montana, USA under permit from the National Park Ser- 
vice. The isolate was initially classified as a Geobacillus sp. Y412MC10 based on its isolation 
conditions and similarity to other organisms isolated from hot springs at Yellowstone National 
Park. Comparison of 1 6 S rRNA sequences within the Bacillales indicated that Geobacillus 
sp.Y412MC10 clustered with Paenibacillus species, and the organism was most closely relat- 
ed to Paenibacillus lautus. Lucigen Corp. prepared genomic DNA and the genome was se- 
quenced, assembled, and annotated by the DOE Joint Genome Institute. The genome se- 
quence was deposited at the NCBI in October 2009 (NC_013406). The genome of 
Paenibacillus sp. Y412MC10 consists of one circular chromosome of 7,121,665 bp with an 
average G+C content of 51 .2%. Comparison to other Paenibacillus species shows the organ- 
ism lacks nitrogen fixation, antibiotic production and social interaction genes reported in 
other paenibacilli. The Y412MC10 genome shows a high level of synteny and homology to 
the draft sequence of Paenibacillus sp. HGF5, an organism from the Human Microbiome Pro- 
ject (HMP) Reference Genomes. This, combined with genomic CAZyme analysis, suggests an 
intestinal, rather than environmental origin for Y412MC10. 



Abbreviations: YT, yeast extract tryptone broth, LB, Luria broth, TB, Terrific Broth 



Introduction 



Numerous novel microorganisms have been iso- 
lated from hot springs in Yellowstone National 
Park. Many of these organisms have been shown 
to possess enzymes with significant potential in 
biotechnological applications [1]. Among the or- 
ganisms first isolated from Yellowstone hot 
springs are Thermus aquaticus [2,3], Thermus 
brockianus [4], Acidothermus cellulollyticus [5], and 
Synechococcus species [6]. As part of a project in 
conjunction with the Department of Energy Joint 



Genome Institute, Lucigen Corp. isolated, charac- 
terized, and sequenced a number of new isolates 
from Yellowstone hot springs. The bacterial iso- 
late Y412MC10 was one of four microorganisms 
isolated from Obsidian Hot Spring, Yellowstone 
National Park, Montana, USA and submitted for 
whole genome sequencing. Y412MC10 was initial- 
ly classified as a Geobacillus sp. based on its isola- 
tion conditions and morphological similarity to 
other organisms such as Geobacillus species 
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Y412MC61 (GenBank 544556), Y412MC52 
(GenBank 550542), and Geobacillus 
thermoglucosidasius C56-YS93 (GenBank 634956). 
The Geobacillus 5p.Y412MC10 draft genome se- 
quence was deposited at the NCBI in October 2009 
(NC_013406) with the lineage entry indicating 
that it is a Geobacillus. Following assembly of the 
complete genome of Y412MC10, the 16S rRNA se- 
quence and genome properties properly assigned 
the organism as a Paenibacillus sp. Y412MC10 
represents the first Paenibacillus sp. isolated from 
a hot spring to have its genome completely se- 
quenced. 

Paenibacillus sp. were originally grouped in the 
genus Bacillus until 1993, when Ash et al. [7] pro- 
posed that members of "group 3" should be trans- 
ferred to the genus Paenibacillus, and proposed 
Paenibacillus polymyxa as the type species. 
Paenibacillus sp. have been isolated from a wide 
range of environments including soil [8], the Ant- 
arctic [9], and the oral cavity of a dog [10]. 
Paenibacillus sp. are of interest for a number of 
reasons, including production of antibiotics [11- 
13], biopolymer-degrading enzymes [14-16], and 
their ability to fix nitrogen [17,18]. One species, P. 
vortex shows highly unusual organized growth 
morphologies on solid surfaces [19,20]; another 
species, P. dendritiformis, also shows unusual 
growth morphologies on solid surfaces [19,21,22]. 
Comparison of the genetic content of Y412MC10 
with genomes of Paenibacillus sp. from other envi- 
ronments will give insights into the evolutionary 
adaptations that have occurred in the 
Paenibacillus. The organism also may be a source 
of novel polysaccharide degrading enzymes for 
use in biomass degradation. 

Classification and features 

A phylogenetic tree was constructed to identify the 
family relationship of strain Y412MC10 (Figure 1). 
The tree was created using BLAST2Tree software 
[23]. The analysis was carried out using only type 
strains of validly-named organisms, and the analy- 
sis shows that Y412MC10 does not clade with 
known Geobacillus species. Rather, Y412MC10 
clades in the Paenibacillus genus. Based on rl6S 
analysis of validly-named organisms, Y412MC10 is 



most closely related to Paenibacillus lautus DSM 
3055 T (AB073188). The classification of the isolate 
was confirmed using the EzTaxon-e server [24], 
again on the basis of 16S rRNA sequence data. 
When compared to their entire rl6S database, 
Y412MC10 was identified as being a strain of 
Paenibacillus lautus with 99.09% identity and 100% 
completeness to the rl6S of the type strain, 
Paenibacillus lautus NRRL NRS-666 GenBank 
D78472. 

Paenibacillus sp. Y412MC10 was one of a number 
of organisms isolated from Obsidian Hot Spring, 
Yellowstone National Park, Montana, USA 
(44.6100594° latitude and -110.4388217° longi- 
tude) under a sampling permit from the National 
Park Service. The hot spring possesses a pH of 
6.37 and a temperature range of 42-90°C. The or- 
ganism was isolated from a sample of hot spring 
water by enrichment and plating on YTP-2 medi- 
um [25] (YTP-2 media contains (per liter) 2.0 g 
yeast extract, 2.0 g tryptone, 2.0 g sodium py- 
ruvate, 1.0 g KC1, 2.0 g KN03, 2.0 g Na 2 HP0 4 .7H 2 0, 
0.1 g MgS0 4 , 0.03 g CaCl 2 , and 2.0 ml clarified to- 
mato juice) at 50°C. Culture stocks were routinely 
maintained on YT (containing (per liter) 5.0 g 
yeast extract, 8.0 g tryptone, and 2.5 g NaCl) agar 
plates at 37°C. As part of the sequencing agree- 
ment with the Joint Genome Institute, the culture 
is available without restrictions from the authors. 
Lucigen, the National Park Service, and the Joint 
Genome Institute have placed no restrictions on 
the use of the culture or sequence data. 
Y412MC10 is a Gram-positive facultative anaerobe 
(Table 1) that grows well on a wide variety of 
standard lab media (YT, TB, LB). On plates, the 
organism grows as rods or chains of rods (Figure 
2A). After growth for 6 days on plates, the cells 
still appear rod-shaped, but an extracellular ma- 
trix appears to surround and bind the individual 
cells together (light green background, Figure 2B). 
In liquid culture, the organism appears to also 
grow as a mixture of single cells and large clumps 
of cells surrounded by an extracellular matrix 
(Figure 2C). Prolonged growth on plates or in liq- 
uid culture results in sporulation of the culture; 
spores are subterminal with swollen sporangia. 
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Figure 1. Phylogenetic tree highlighting the position of Paenibacillus lautus strain Y412MC10 and relative strains 
within the Bacillales. Strains used for the comparison (1 6S rRNA accession number) were (Paenibacillus 
Y412MC10); Paenibacillus lautus DSM 3035 T (AB073188); Paenibacillus glucanolyticus DSM 51 62 T (AB073189); 
Paenibacillus illinoisensis DSM 11733 T (AB073192); Paenibacillus chibensis DSM 1 1 731 T (AB073194); 
Paenibacillus ginsengisoli LMG 23406 T (AB245382); Paenibacillus macquariensis subsp defensor NCIB 14397 T 
(AB360546); Paenibacillus kribbensis KCTC 0766BP T (AF391123); Paenibacillus daejeonensis KCCM 41557 T 
(AF391124); Paenibacillus borealis DSM 13188 T (AJ011322); Paenibacillus odorifer DSM 15391 T (AJ223990); 
Paenibacillus cookie LMG 18419 T (AJ250317); Paenibacillus jamilae DSM 13815 T (AJ2711 57); Paenibacillus 
azoreducens DSM 1 3822 T (AJ2 72249); Paenibacillus stellifer DSM 1 4472 T (AJ31 601 3); Paenibacillus peoriae DSM 
8320 T (AJ320494); Paenibacillus cineris LMG 18349 T (AJ575658); Paenibacillus antarcticus LMG 22078 T 
(AJ605292); Paenibacillus favisporus LMG 20987 T (AY208751); Paenibacillus lactis DSM 15596 T (AY257868); 
Paenibacillus provencensis CIP 109358 T (EF212893); Paenibacillus konsidanse KCTC 13165 T (EU081509). 
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Figure 2A. Micrograph of Paenibacillus strain Y412MC10 cells showing individ- 
ual cells and chains of cells. Cells were streaked on YT agar and incubated 18 
hr. at 37°C. A colony was removed, re-suspended in sterile water and stained 
using a 5uM solution of SvTO® 9 fluorescent stain in sterile water (Molecular 
Probes). Dark field fluorescence microscopy was performed using a Nikon 
Eclipse TE2000-S epifluorescence microscope at 2000x magnification using a 
high-pressure Hg light source and a 500 nm emission filter. 




Figure 2B. Micrograph of same plate of Paenibacillus strain Y412MC10 cells 
showing individual cells and chains of cells. Same plate as Figure 2A, but in- 
cubated 6 days at 37°C. 



http://standardsingenomics.org 



369 



Paenibacillus sp. Y41 2MC1 0 



Table 1 Classification and general features of Paenibacillus strain Y412MC10 



MIGS ID 


Property 


Term 


Evidence code 






Domain Bacteria 


TAS [26] 






Phylum Firmicutes 


TAS [27-29] 






Class Bacilli 


TAS [30,31] 




current ciassiTication 


Order Bacillales 


TAS [32,33] 






Family Bacillaceae 


TAS [32,34] 






Genus Paenibacillus 


TAS [35-39] 






Species Paenibacillus lautus 


TAS [40] 






Strain Y412MC10 


IDA 




Gram stain 


positive 


IDA 




Cell shape 


rods and chains of rods 


IDA 




Motility 


motile 


IDA 




Sporulation 


sporulating 


IDA 




Temperature range 


mesophilic 


IDA 




Optimum temperature 


37°C 


IDA 


MIGS-22 


Oxygen requirement 


facultative anaerobe 


IDA 




Carbon source 


Carbohydrate or protein 


IDA 




Energy source 


chemoorganotrophic 


IDA 




Electron acceptor 


Oxygen, nitrate 


IDA 


MIGS-6 


Habitat 


hot spring 


IDA 


MIGS-6.3 


Salinity 


Grows in 3% NaCl 


IDA 


MIGS-15 


Biotic relationship 


free-living 


IDA 


MIGS-14 


Pathogenicity 


None, BSL1 


IDA 




Isolation 


Obsidian spring 


IDA 


MIGS-4 


Geographic location 


Yellowstone National Park 


IDA 


MIGS-5 


Sample collection time 


September 2003 


IDA 


MIGS-4. 1 


Latitude 


44.6100594 


TAS [1] 


MIGS-4.2 


Longitude 


-110.4388217 


TAS [1] 


MIGS-4.3 


Depth 


Surface of spring 


IDA 


MIGS-4.4 


Altitude 


2416 m 


TAS [1] 



Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., 
a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not di- 
rectly observed for the living, isolated sample, but based on a generally accepted proper- 
ty for the species, or anecdotal evidence). These evidence codes are from the Gene On- 
tology project [41]. 
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Figure 2C. Micrograph of Paenibacillus strain Y412MC10 cells showing indi- 
vidual cells and clumps of cells. Cells were grown in YTP-2 media for 1 8 hours 
at 37°C and 200 rpm. An aliquot was removed and stained using a 5 uM sol- 
tion of SYTO® 9 fluorescent stain in sterile water (Molecular Probes). Dark field 
fluorescence microscopy was performed using a Nikon Eclipse TE2000-S 
epifluorescence microscope at 200x magnification using a high-pressure Hg 
light source and 500 nm emission filter. 



Genome sequencing and annotation 

Genome project history 

Paenibacillus sp.Y412MC10 was selected for se- 
quencing on the basis of its biotechnological poten- 
tial as part of the U.S. Department of Energy's Ge- 
nomic Science program (formerly Genomics:GTL). 
The genome sequence is deposited in the Genomes 
On Line Database [42] (GOLD ID = Gc01127), and in 
GenBank (NCBI Reference Sequence = NC_013406). 
Sequencing, finishing and annotation were per- 
formed by the DOE Joint Genome Institute (JGI). A 
summary of the project information and its associa- 
tion with MIGS identifiers is shown in Table 2. 

Growth conditions and DNA Isolation 

For preparation of genomic DNA, liter cultures of 
Y412MC10 were grown from a single colony in 
YTP-2 medium at 37°C in flasks agitated at 200 
rpm and collected by centrifugation. The cell con- 
centrate was lysed using a combination of SDS and 
proteinase K, and genomic DNA was isolated using 
a phenol/chloroform extraction [43]. The genomic 
DNA was precipitated, and treated with RNase to 
remove residual contaminating RNA. 



Genome sequencing and assembly 

The genome of Paenibacillus lautus Y412MC10 was 
sequenced at the Joint Genome Institute (JGI) [44] 
using Sanger sequencing with a combination of 6 kb 
and 34 kb DNA libraries and 454 FLX 
pyrosequencing done to a depth of 20x coverage 
[45]. Both libraries provided 5.8x coverage of the 
genome. Draft assemblies were based on 39,162 to- 
tal reads. Solexa sequencing data was used to polish 
the assembly. All general aspects of library construc- 
tion and sequencing performed at the JGI can be 
found at their website. The Phred/Phrap/Consed 
software package [46] was used to assemble 6-kb 
and fosmid libraries and to assess quality. Possible 
mis-assemblies were corrected; gaps between 
contigs were closed by 2,744 primer walks from 
sub-clones or 83 PCR end reads, 5 mini-libraries, 
and 10 PCR shatter libraries. The error rate of the 
completed genome sequence was 0.08, based on 
49,558 total reads. Table 2 presents the project in- 
formation and its association with MIGS version 2.0 
compliance [47]. 
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Table 2. Genome sequencing and project information 



MIGS ID Property 



Term 



MIGS-31 Finishing quality 
MIGS-28 Libraries used 
MIGS-29 Sequencing platforms 
MIGS-31 .2 Fold coverage 
MIGS-30 Assemblers 
MIGS-32 Gene calling method 
GenBank ID 

Gen Bank Date of Release 
GOLD ID 
Project relevance 



Finished 

6 kb and 34 kb 

ABI3730, 454 Titanium, 

5.8 

Phred/Phrap/Consed 
Prodigal, GenePRIMP 
CP001 793.1 
October 7, 2009 
Gi02010 

Biotechnological 



lumina GAii 



Genome annotation 

Genes were identified using Prodigal [48] as part of 
the Oak Ridge National Laboratory genome annota- 
tion pipeline, followed by a round of manual 
curation using the JGI GenePRIMP pipeline [49]. The 
predicted CDSs were translated and used to search 
the National Center for Biotechnology Information 
(NCBI) nonredundant database, UniProt, TIGRFam, 
Pfam, PRIAM, KEGG, COG, and InterPro databases. 
These data sources were combined to assert a prod- 
uct description for each predicted protein. Non- 
coding genes and miscellaneous features were pre- 
dicted using tRNAscan-SE [49], RNAMMer [50], 
Rfam [51], TMHMM [52], and signalP [52]. 



Genome properties 

The genome of Paenibacillus lautus Y412MC10 con- 
sists of one circular chromosome of 7,121,665 bp 
with an average G+C content of 51.2% (Table 3 
and Figure 3). There are 73 tRNA genes, 24 rRNA 
genes and 4 "other" identified RNA gene. There are 
6,343 predicted protein-coding regions and 105 
pseudogenes in the genome. A total of 4,651 genes 
(72.2%) have been assigned a predicted function 
while the rest have been designated as hypothetical 
proteins. The numbers of genes assigned to each 
COG functional category are listed in Table 4. About 
20% of the annotated genes were either not as- 
signed to a COG or have an unknown function. 



Table 3. Genome statistics. 



Attribute 


Value 


% of total 3 


Genome size (bp) 


7,121,665 


100.0 


DNA coding region (bp) 


6,141,611 


86.2 


DNA G+C content (bp) 


3,649,102 


51.2 


Number of replicons 






Total genes 


6,444 


100.0 


RNA genes 


101 


1.6 


rRNA operons 






Protein-coding genes 


6,343 


98.4 


Pseudogenes 






Genes in paralog clusters 


1,599 


24.8 


Genes assigned to COGs 


4,651 


72.2 


Genes with signal peptides 


1,506 


23.4 


Genes with transmembrane helices 


1,921 


29.8 


Paralogous groups 


532 





a) The total is based on either the size of the genome in base pairs 
or the total number of protein coding genes in the annotated ge- 
nome. 
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Figure 3. Graphical circular map of the chromosome. From outside to the center: Genes on forward strand (color 
by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, 
other RNAs black), GC content, GC skew. 



Insights from the genome sequence 

Motility of Paenibacillus cells on solid media has 
been observed with a number of species. P. 
lautus is reported to spread across plates 69]. P. 
vortex shows highly unusual organized growth 
morphologies on solid surfaces [19,20] forming 
complex patterns on the plate. Another species, 
P. dendritiformis, also shows unusual growth 



morphologies on solid surfaces [19,21,22]. 
Y412MC10 was evaluated for spreading behavior 
on plates; the results ((Figure 4A Figure 4B Fig- 
ure AC] show definite spreading behavior for 
Y412MC10. The spreading behavior does not, 
however, appear to be as complex as reported for 
P. vortex and P. dendritiformis. 



http://standardsingenomics.org 



373 



Paenibacillus sp. Y41 2MC1 0 



Table 4. Number of genes associated with the 25 general COG functional categories. 



Code 


Value 


%age 


Description 


J 


224 


3.6 


Translation, ribosomal structure and biogenesis 


A 






RNA processing and modification 


K 


663 


10.6 


Transcription 


L 


152 


3.2 


Replication, recombination and repair 


B 


2 


0.03 


Chromatin structure and dynamics 


D 


213 


3.4 


Cell cycle control, cell division, chromosome partitioning 


Y 






Nuclear structure 


V 


244 


3.9 


Defense mechanisms 


T 


480 


7.7 


Signal transduction mechanisms 


M 


440 


7.0 


Cell wall/membrane/envelope biogenesis 


N 


130 


2.1 


Cell motility 


Z 


5 


0.1 


Cytoskeleton 


w 


0 


0.0 


Extracellular structures 


u 


41 


0.7 


Intracellular trafficking, secretion, and vesicular transport 


o 


331 


5.3 


Posttranslational modification, protein turnover, chaperones 


c 


415 


6.6 


Energy production and conversion 


G 


1,030 


16.5 


Carbohydrate transport and metabolism 


E 


788 


12.6 


Amino acid transport and metabolism 


F 


244 


3.9 


Nucleotide transport and metabolism 


H 


373 


6.0 


Coenzyme transport and metabolism 


I 


88 


1.4 


Lipid transport and metabolism 


P 


559 


9.0 


Inorganic ion transport and metabolism 


Q 


283 


4.5 


Secondary metabolites biosynthesis, transport and catabolism 


R 


872 


14.0 


General function prediction only 


S 


371 


6.0 


Function unknown 




0 


0.0 


Not in COGs 



The total is based on the total number of protein coding genes in the annotated genome. Data 
from [53]. 
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Figure 4A. Photograph of Paenibacillus sp. 
Y412MC10 streaked on YT agar containing 
100 mg/l carbenicillin and incubated at 
37°C for 18 hours. 




Figure 4B. Photograph of Paenibacillus 
sp. Y412MC10 streaked on YT agar and 
incubated at 37°C for 50 hours. Note 
continued clearing of center area and 
significant spreading of outside edges of 
culture. 




Figure 4C. Photograph of Paenibacillus sp. 
Y412MC10 streaked on YT agar and incu- 
bated at 37°C for 168 hours. Note contin- 
ued clearing of center area and significant 
spreading of outside edges of culture. 
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To further understand the phylogenetic relationships 
of Y412MC10, a separate phylogenetic tree was con- 
structed of Paenibacillus sp. having either complete or 
draft genome. This was necessary because there is 
little or no overlap between validly-named 
Paenibacillus strains and the Paenibacillus strains 
submitted for whole genome sequencing. The tree 
was created on the IMG website [54] using the align- 
ment of 16S genes based on the SILVA database and 
dnadist and neighbor tools from the Phylip package 
[55]. Paenibacillus sp. JDR-2, Paenibacillus 
curdlanolyticus YK9, Paenibacillus polymyxa E681, 



Paenibacillus polymyxa SC2, and Paenibacillus 
mucilaginosus KNP414 are all species isolated from 
soil or plant roots. Paenibacillus vortex V453, 
Paenibacillus sp. HGF5, Paenibacillus sp. HGF7, are 
oral or intestinal human isolates. Paenibacillus larvae 
subsp. larvae B-3650, 741161 is a honeybee pathogen 
that attacks bee larvae. The rl6S analysis (Figure 5) 
shows Y412MC10 is most closely related to 
Paenibacillus sp. HGF5 (NCBI Taxon ID 908341, Gold 
ID Gi05716), an organism being sequenced as part of 
the Human Microbiome Project (HMP) Reference Ge- 
nomes ( http://www.hmpdacc.org ). 



Puertibaciffus sp r JDR-2 

PaenibadSm curdiunoiytkus YK9 



r Paembadiius sp r Y412MC10 



Paembocittus sp. HGFB 



Paertibutiffus vortex V453 



— Pae/tiboci//us poiymyxa E6B1 
Paembucittus pofymyxa 5 €2 



PaenibaciHus larvae farvae 8-3650 



Poertibotitfus mucilaginosus KNP414 

P&e nibacittus sp r HGP7 



Bacittus subti/is subti/is 



Figure 5. Phylogenetic tree highlighting the position of Paenibacillus sp. Y412MC10 and relative strains within the 
Bacillales. The strains and their corresponding GenBank taxonomy ID numbers are: Paenibacillus sp. JDR-2, 
324057; Paenibacillus curdlanolyticus YK9, A71 7606; Paenibacillus sp. Y412MC10, 481743; Paenibacillus vortex 
V453, 715225; Paenibacillus polymyxa E681, 349520; Paenibacillus polymyxa SC2, 886882; Paenibacillus 
mucilaginosus KNP414, 1036673; Paenibacillus sp. HGF5, 908341; Paenibacillus sp. HGF7, 944559; 
Paenibacillus larvae subsp. larvae B-3650, 741 1 61 ; Bacillus subtilis subtilis 1 68, 2243082. 
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COG [Figure 6] and TIGRfam [Figure 7] whole ge- 
nome comparisons were carried out between 
Y412MC10 and draft and finished genomes of 
closely related organisms using IMG software [56]. 
The results of the COGs and TIGRfam whole ge- 
nome comparisons place Y412MC10 clearly among 
the Paenibacillus species, in agreement with the 
results from 16S analysis. The 16S analysis shows 
Y412MC10 is most closely related to P. vortex and 
P. sp. HGF5, both human isolates, and then to the 
two P. polymyxa sp. In both whole genome analyses, 
Y412MC10 is again most closely related to 
Paenibacillus vortex and Paenibacillus sp. HGF5. In 
the COG comparison, the other human isolate, P. sp. 
HGF7, is not closely related to P. vortex and P. sp. 



HGF5; in the TIGRfam comparison, HGF7 clades 
closely with Y412MC10, P. vortex and P. sp. HGF5. 
These results also suggest a mammalian, rather 
than environmental, ecosystem as the home of 
Y412MC10. 

To further understand the relationship between 
these organisms, whole genome alignments were 
performed using Mummer software to generate dot 
plot diagrams comparing pairs of genomes on the 
IMG website [57] using input DNA sequences di- 
rectly (NUCmer). The close relationship between 
the genome of Y412MC10 and the genomes of 
HGF5 and P. vortex is reflected in the high levels of 
homology and synteny seen (Figure 8, Figure 9) 
with these two human isolates. 



Paenfbadffus_ farvae_ /arvae_ B_ 3650 



Paenibacifius_pofym yxa_ E681 



PaentbacWus_ curdianoiyUcus_ YK9 



Paenibaciffus_ mudiagmosus_ K NP414 



I — Paentbadlius_sp JDR_2 



I — Paenibadflus_ vortex_ V453 



r Paenibadflus_ sp Y412MC10 



L Paenibadflus_ sp HGF5 



1 Paenibadilus_ sp HGF7 

$.01 

Figure 6. COGs whole genome comparison of selected strains. Comparison was per- 
formed as described in text; organisms and GenBank accession numbers are described 
in Figure 5. 
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Figure 7. TIGRfam whole genome comparison of selected strains. Comparison 
was performed as described in text; organisms and GenBank accession numbers 
are described in Figure 5. 



In comparison, whole genome alignment of 
Y412MC10 with the genomes of P. polymyxa and P. 
mucilaginosus show little homology or synteny be- 
tween Y412MC10 and the two soil organisms (Fig- 
ure 10, Figurell). 

The similarity between the rl6S sequences of P. 
lautus and Y412MC10 led us to examine if biochem- 
ical evidence suggested a similar habit for both. 
Bacillus lautus was first isolated from the intestinal 
tract of children [58]; later, the identity of the or- 
ganism was re-confirmed and the organism was 
reclassified and renamed Paenibacillus lautus. Ex- 
amination of the genome of Y412MC10 lends sup- 
port to the hypothesis that Y412MC10 also has an 
intestinal origin. An analysis of the carbohydrate 
active enzymes (CAZY [59]) shows very low levels 
of GH family 5, 6, 8, 9, 10, 11, and 48 as well as no 
CBM 2 or 3 members, suggesting an inability to sig- 
nificantly degrade cellulose and hemicellulose 
components of biomass. CAZy analysis shows a ge- 
nome enriched in GH29 and GH95 a-fucosidases; 



the genome is also enriched in GH38 and GH125 a- 
mannosidases and GH78 a-L-rhamnosidases. All 
these enzyme groups attack carbohydrate 
sidechains attached to eukaryotic glycoproteins; 
such glycoproteins are found in abundance in intes- 
tinal cell walls. CAZy analysis also shows a genome 
enriched in GH18 chitinases, GH28 
polygalacturonases, GH88 unsaturated glucuronyl 
hydrolases, GH105 unsaturated rhamno- 
galacturonyl hydrolases and pectate lyase (PL) 
family members. These enzymes attack dietary fi- 
ber components that would be resistant to diges- 
tion by most ruminant bacteria, allowing the organ- 
ism to scavenge sugars from pre-digested dietary 
sources. The enzymes required for bacillibactin 
production appear to be present in the genome of 
Y412MC10; bacillibactin is involved in iron acquisi- 
tion. Iron is in limited supply in intestinal environ- 
ments, but is present in large excess (approximate- 
ly 2 uJVl Fe 2+ ) in Obsidian hot spring. 



378 



Standards in Genomic Sciences 



Mead et al. 




Paenibacillus sp. Y412MC10 



Figure 8. Dot plot comparison of Y412MC10 and HGF5 performed on IMG website. HGF5 draft ge- 
nome contains 1 85 scaffolds. 
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Figure 9. Dot plot comparison of Y412MC10 and P. vortex performed on IMG website. P. vortex 
draft genome contains 56 scaffolds. 
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Figure 10. Dot plot comparison of Y412MC10 and P. polymyxa performed on IMG website. 




Paenibacillus sp. Y412MC10 



Figure 1 1 . Dot plot comparison of Y41 2MC1 0 and P mucilaginosus performed on IMG website. 



380 



Standards in Genomic Sciences 



Mead et al. 



This again argues for an intestinal origin for the 
organism. Y4112MC10 does not possess genes 
usually involved in detoxification of heavy metals 
and sulfide found in other hot springs organisms 
(unpublished results). The organism also lacks 
antibiotic production genes, indicating it comes 
from an environment with excess resources, typi- 
cal of the intestine. The growth temperature range 
and optimum of Y412MC10 is an excellent match 
for intestinal conditions, but a poor fit for the con- 
ditions of Obsidian hot spring, where tempera- 
tures average 79±4°C. Nitrogen fixing 
Paenibacillus have been isolated from the 
rhizosphere, including Paenibacillus brasilensis 
[60], and Paenibacillus zanthoxyli [61]. 
Paenibacillus lautus Y412MC10 has no nitrogen- 
fixing genes; these would be of no advantage for a 
free-living organism in an intestinal environment. 

Complex cooperative behaviors such as those seen 
with P. dendritiformis [62], and P. vortex [19] are 
not observed with Y412MC10; again, these behav- 
iors may be unnecessary for survival in the intes- 
tine. Formation of external matrices in liquid and 
solid cultures may be beneficial to Y412MC10 for 
survival; the matrix may allow attachment of the 
bacteria to intestinal mucosa. 
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